Code
start_time <- Sys.time()Rodrigo Hermont Ozon, Érick Oliveira Rodrigues
Invalid Date
This small document have the goal to share the time series extraction and the two basic features building, like price returns and their conditional variance…
[… to be written …]
Loading time series data, for portfolio setting…
Obtain daily prices and their returns:
[ 0% ]
[************ 25% ] 2 of 8 completed
[****************** 38% ] 3 of 8 completed
[**********************50% ] 4 of 8 completed
[**********************62%***** ] 5 of 8 completed
[**********************75%*********** ] 6 of 8 completed
[**********************88%***************** ] 7 of 8 completed
[*********************100%***********************] 8 of 8 completed
Plotting the time series prices (in level):
Obtain the returns time series (first feature):
\[ \mbox{Price log returns}_t = ln(p_t) - ln(p_{t-1}) \]
And plot it:
# Plot time series of log returns one by one
for column in portfolio_log_returns.columns:
plt.figure(figsize=(10, 6))
sns.lineplot(x=portfolio_log_returns.index, y=portfolio_log_returns[column])
plt.title(f'Time Series of {column} Log Returns')
plt.xlabel('Date')
plt.ylabel('Log Returns')
plt.grid(True)
plt.show()The GARCH(1,1) model with an asymmetric Student-t distribution is not directly available in most Python libraries. However, we can still use a GARCH(1,1) model with a standard Student-t distribution to estimate the conditional variance. The GARCH(1,1) model is represented as follows:
\[ r_t = \mu + \epsilon_t \]
\[ \epsilon_t = \sigma_t z_t, \quad z_t \sim t_{\nu}(0, 1) \]
\[ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \]
Where:
# Estimate GARCH(1,1) model for each asset and extract conditional variances
conditional_variances = pd.DataFrame(index=portfolio_log_returns.index)
for column in portfolio_log_returns.columns:
model = arch_model(portfolio_log_returns[column], vol='Garch', p=1, q=1, dist='t')
model_fitted = model.fit(disp='off')
conditional_variances[column] = model_fitted.conditional_volatility ** 2
# Plot time series of conditional variances one by one
for column in conditional_variances.columns:
plt.figure(figsize=(10, 6))
sns.lineplot(x=conditional_variances.index, y=conditional_variances[column])
plt.title(f'Conditional Variance of {column} (GARCH(1,1) Estimated)')
plt.xlabel('Date')
plt.ylabel('Conditional Variance')
plt.grid(True)
plt.show()library(tidyverse)
library(dplyr)
library(ggplot2)
#library(plotly)
library(rugarch)
library(timeSeries)
library(fPortfolio)
library(quantmod)
library(caTools)
library(PerformanceAnalytics)
library(MASS)
library(PortfolioAnalytics)
library(ROI)
require(ROI.plugin.glpk)
require(ROI.plugin.quadprog)
library(quadprog)
library(corpcor)
library(DEoptim)
library(cowplot) # devtools::install_github("wilkelab/cowplot/")
library(lattice)
library(timetk)Loading time series data, for portfolio setting…
Obtain daily prices and their returns:
portfolioPrices <- NULL
for ( Ticker in tickers )
portfolioPrices <- cbind(
portfolioPrices,
getSymbols.yahoo(
Ticker,
from = "2019-01-01",
auto.assign = FALSE
)[,4]
)
portfolioPrices <- portfolioPrices[apply(portfolioPrices, 1, function(x) all(!is.na(x))),]
colnames(portfolioPrices) <- c(
"corn_fut",
"wheat_fut",
"KCWheat_fut",
"rice_fut",
"Feeder_Cattle",
"soymeal_fut",
"soyF_fut",
"soybeans_fut"
)
tail(portfolioPrices) corn_fut wheat_fut KCWheat_fut rice_fut Feeder_Cattle soymeal_fut
2024-09-27 418.00 384.75 576.75 1508.0 247.075 1065.75
2024-09-30 424.75 392.50 583.75 1529.5 246.200 1057.00
2024-10-01 429.00 388.00 598.25 1530.0 246.150 1057.25
2024-10-02 432.50 389.75 619.25 1516.5 249.725 1056.00
2024-10-03 428.25 383.75 611.50 1517.0 248.975 1046.00
2024-10-04 424.75 388.25 598.00 1509.5 249.625 1037.75
soyF_fut soybeans_fut
2024-09-27 343.7 42.18
2024-09-30 344.3 43.51
2024-10-01 350.0 42.91
2024-10-02 341.4 43.72
2024-10-03 332.8 44.69
2024-10-04 330.5 44.04
Plotting the time series prices (in level):
portfolioPrices |> as.data.frame() |>
mutate(
time = seq_along( corn_fut )
) |>
pivot_longer(
!time,
names_to = "Variables",
values_to = "Value"
) |>
group_by(Variables) |>
plot_time_series(
time,
Value,
.interactive = F, # Change for TRUE for better visualization
.facet_ncol = 2,
.smooth = FALSE
) +
theme(
strip.background = element_rect(fill = "white", colour = "white")
)Obtain the returns time series (first feature):
\[ \mbox{Price log returns}_t = ln(p_t) - ln(p_{t-1}) \]
# Calculate log returns for the portfolio prices
portfolioReturs <- na.omit(diff(log(portfolioPrices))) |> as.data.frame()
colnames(portfolioReturs) <- c(
"ret_corn_fut",
"ret_wheat_fut",
"ret_KCWheat_fut",
"ret_rice_fut",
"ret_Feeder_Cattle",
"ret_soymeal_fut",
"ret_soyF_fut",
"ret_soybeans_fut"
)
glimpse(portfolioReturs)Rows: 1,449
Columns: 8
$ ret_corn_fut <dbl> 0.0105891128, 0.0085218477, -0.0019601444, -0.005903…
$ ret_wheat_fut <dbl> 0.0008980692, 0.0053715438, -0.0017873106, 0.0115608…
$ ret_KCWheat_fut <dbl> 0.0220892515, 0.0049529571, -0.0059464992, 0.0039682…
$ ret_rice_fut <dbl> 0.0083930387, 0.0053934919, 0.0174507579, 0.00813596…
$ ret_Feeder_Cattle <dbl> -0.0096783375, -0.0111522135, 0.0075628145, 0.011068…
$ ret_soymeal_fut <dbl> 0.0061281529, 0.0102224954, 0.0030190774, -0.0065988…
$ ret_soyF_fut <dbl> 0.0054513913, 0.0076457646, 0.0097900861, -0.0018874…
$ ret_soybeans_fut <dbl> 0.0099858421, 0.0081286732, -0.0052938051, -0.002834…
Plot all time series and their returns:
portfolioReturs |>
mutate(
time = seq_along( ret_corn_fut )
) |>
pivot_longer(
!time,
names_to = "Variables",
values_to = "Value"
) |>
group_by(Variables) |>
plot_time_series(
time,
Value,
.interactive = F, # Change for TRUE for better visualization
.facet_ncol = 2,
.smooth = FALSE
) +
theme(
strip.background = element_rect(fill = "white", colour = "white")
)Plotting the histograms:
portfolioPrices_df <- as_tibble(portfolioPrices, rownames = "date")
portfolioPrices_df$date <- ymd(portfolioPrices_df$date)
portfolioReturs_df <- na.omit( ROC( portfolioPrices ), type = "discrete" ) |>
as_tibble(rownames = "date")
portfolioReturs_df$date <- ymd(portfolioReturs_df$date)
colnames(portfolioReturs_df) <- c(
"date",
"ret_corn_fut",
"ret_wheat_fut",
"ret_KCWheat_fut",
"ret_rice_fut",
"ret_Feeder_Cattle",
"ret_soymeal_fut",
"ret_soyF_fut",
"ret_soybeans_fut"
)
# Remover a coluna com nome NA
portfolioReturs_df <- portfolioReturs_df[, !is.na(colnames(portfolioReturs_df))]
# Verificar novamente os nomes das colunas para garantir que estão corretos
colnames(portfolioReturs_df)[1] "date" "ret_corn_fut" "ret_wheat_fut"
[4] "ret_KCWheat_fut" "ret_rice_fut" "ret_Feeder_Cattle"
[7] "ret_soymeal_fut" "ret_soyF_fut" "ret_soybeans_fut"
portfolioReturs_long <- portfolioReturs_df |>
pivot_longer(
cols = -date, # Exclui a coluna de data
names_to = "fut_type",
values_to = "returns"
)
ggplot(portfolioReturs_long, aes(x = returns)) +
geom_histogram(aes(y = ..density..), binwidth = .01, color = "black", fill = "white") +
geom_density(alpha = .2, fill="lightgray") +
theme_minimal() +
theme(
axis.line = element_line(colour = "black"),
axis.text = element_text(colour = "black"),
axis.ticks = element_line(colour = "black"),
legend.position = c(.1,.9),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
) +
theme(plot.title = element_text(size = 10),
axis.title.x = element_text(size = 7),
axis.title.y = element_text(size = 7)) +
labs(x = "Returns", y = "Density") +
facet_wrap(~fut_type, scales = "free", ncol = 2) And finnaly, the last feature, is called, the conditional variance (risk measure), obtained by GARCH(1,1) model, formalized as:
The GARCH(1,1) model with asymmetric Student-t distribution can be represented mathematically as:
\[ r_t = \mu + \epsilon_t \]
\[ \epsilon_t = \sigma_t z_t, \quad z_t \sim t_{\nu}(0, 1) \]
\[ \sigma_t^2 = \omega + \alpha \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2 \]
Where:
# Load necessary packages
library(rugarch)
# Define the GARCH(1,1) model specification with Student-t distribution
spec <- ugarchspec(
variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
mean.model = list(armaOrder = c(0, 0), include.mean = TRUE),
distribution.model = "std" # Using Student-t distribution
)
# Estimate the model for each asset in the portfolio and extract conditional variances
garch_models <- list()
conditional_variances <- list()
for (i in colnames(portfolioReturs)) {
garch_models[[i]] <- ugarchfit(spec, data = portfolioReturs[[i]])
conditional_variances[[i]] <- sigma(garch_models[[i]])^2
}
# Convert conditional variances list to a data frame
conditional_variances_df <- do.call(cbind, conditional_variances) %>%
as.data.frame() %>%
mutate(time = seq_along(conditional_variances[[1]]))
colnames(conditional_variances_df) <- c(
"cond_var_corn_fut",
"cond_var_wheat_fut",
"cond_var_KCWheat_fut",
"cond_var_rice_fut",
"cond_var_Feeder_Cattle",
"cond_var_soymeal_fut",
"cond_var_soyF_fut",
"cond_var_soybeans_fut",
"time"
)
# Reshape data for plotting
conditional_variances_long <- conditional_variances_df %>%
pivot_longer(!time, names_to = "Variables", values_to = "Value")And the plot of the conditional variance (risk):
Gujarati, D., N. (2004) Basic Econometrics, fourth edition, The McGraw−Hill Companies
Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2019). Multivariate Data Analysis. Pearson.
Hyndman, R.J., & Athanasopoulos, G. (2021) Forecasting: principles and practice, 3rd edition, OTexts: Melbourne, Australia. OTexts.com/fpp3. Accessed on oct 2023.
Para citar o pacote 'tidyverse' em publicações use:
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R,
Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller
E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V,
Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to
the tidyverse." _Journal of Open Source Software_, *4*(43), 1686.
doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
Uma entrada BibTeX para usuários(as) de LaTeX é
@Article{,
title = {Welcome to the {tidyverse}},
author = {Hadley Wickham and Mara Averick and Jennifer Bryan and Winston Chang and Lucy D'Agostino McGowan and Romain François and Garrett Grolemund and Alex Hayes and Lionel Henry and Jim Hester and Max Kuhn and Thomas Lin Pedersen and Evan Miller and Stephan Milton Bache and Kirill Müller and Jeroen Ooms and David Robinson and Dana Paige Seidel and Vitalie Spinu and Kohske Takahashi and Davis Vaughan and Claus Wilke and Kara Woo and Hiroaki Yutani},
year = {2019},
journal = {Journal of Open Source Software},
volume = {4},
number = {43},
pages = {1686},
doi = {10.21105/joss.01686},
}